Cherry-pick #1728 to r0.4.0 (Qwen refs removed) #2030
Closed
* feat: Add diffusion pipelines for nightly runs

  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* Reduce ci runtime to 30 minutes

  Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* debug: Check if HF_TOKEN is set

  Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* test: revert test variables

  Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* feat: add HunyuanVideo nightly CI test and parameterize diffusion launcher

  Add HunyuanVideo-1.5 to the diffusion finetuning CI pipeline alongside Wan2.1. Parameterize the launcher script to derive model-specific settings (processor, generate config, model name, frame counts) from the recipe config name.

  Also fix a pre-existing T5 layer norm compatibility issue in finetune.py that affects Hunyuan training with incompatible apex builds.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* style: ruff format on modified files

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* revert: remove patch_t5_layer_norm from finetune.py

  The patch was a workaround for an ABI-incompatible apex build on a specific compute node, not a code issue. CI Docker builds apex from source so it is not needed there.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

* feat: add Flux and QwenImage T2I nightly CI tests

  Extend the diffusion nightly CI pipeline to support text-to-image models (Flux and QwenImage) alongside the existing text-to-video models (Wan, HunyuanVideo). Uses the diffusers/tuxemon dataset for image CI smoke tests.

  Changes:
  - Add MEDIA_TYPE branching in launcher for image vs video stages
  - Add tuxemon dataset download/extraction with JSONL captions
  - Add image preprocessing and .png inference verification paths
  - Add ci: sections to flux_t2i_flow.yaml and qwen_image_t2i_flow.yaml
  - Register QwenImagePipeline in generate.py output type mapping

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>

---------

Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>
Co-authored-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
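The launcher parameterization described in the commit message (deriving per-model settings from the recipe config name, with MEDIA_TYPE separating image and video stages) might look roughly like the sketch below. It is not the contents of diffusion_finetune_launcher.sh. Only MEDIA_TYPE, the flux_t2i_flow recipe name, and the .png check come from the text above. The function name derive_settings, the OUTPUT_EXT variable, the wan*/hunyuan* patterns, and the mp4 extension are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map a recipe config name to model-specific settings.
# derive_settings, OUTPUT_EXT, the wan*/hunyuan* patterns and "mp4" are
# illustrative assumptions, not taken from the real launcher.

derive_settings() {
  local config_name="$1"
  case "$config_name" in
    flux_t2i_flow*)
      # Text-to-image recipe: inference output is checked as .png
      MEDIA_TYPE="image"
      OUTPUT_EXT="png"
      ;;
    wan*|hunyuan*)
      # Assumed name patterns for the text-to-video recipes
      MEDIA_TYPE="video"
      OUTPUT_EXT="mp4"
      ;;
    *)
      echo "unknown recipe config: $config_name" >&2
      return 1
      ;;
  esac
}

derive_settings "flux_t2i_flow"
echo "$MEDIA_TYPE $OUTPUT_EXT"
```

Keeping the mapping in a single case statement means that dropping a model, as this cherry-pick does for QwenImage, amounts to deleting one branch.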
Summary
Manual cherry-pick of #1728 (feat: Add diffusion finetuning CI pipeline for nightly runs) onto r0.4.0, with all QwenImage-specific additions removed. The underlying Qwen-Image support (#1704, #1976) is not on r0.4.0, so wiring CI for it would be half-implemented and non-functional.

What is excluded (vs. the original #1728)
- examples/diffusion/finetune/qwen_image_t2i_flow.yaml (not created on r0.4.0)
- "QwenImagePipeline": "image" entry in examples/diffusion/generate/generate.py
- "- qwen_image_t2i_flow.yaml" entry in tests/ci_tests/configs/diffusion_finetune/nightly_recipes.yml
- qwen_image_t2i_flow*) case block in tests/ci_tests/scripts/diffusion_finetune_launcher.sh

What lands
CI infra for the models already supported on r0.4.0: Wan2.1, HunyuanVideo, Flux. New files:
- tests/ci_tests/configs/diffusion_finetune/nightly_recipes.yml
- tests/ci_tests/configs/diffusion_finetune/override_recipes.yml
- tests/ci_tests/scripts/diffusion_finetune_launcher.sh

Plus CI sections appended to the existing Wan/Hunyuan/Flux recipe yamls, and updates to tests/ci_tests/utils/generate_ci_tests.py.

Test plan
r0.4.0 with the cherry-pick

🤖 Generated with Claude Code
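One rough way to check that the Qwen references listed under "What is excluded" are really absent after the cherry-pick is a case-insensitive text search over the touched files. The helper below is a hypothetical sketch, not part of the PR: find_qwen_refs is an invented name, and a plain grep can only show that the string does not appear, not that the CI wiring is otherwise correct.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files that still mention "qwen" (case-insensitive).
# Prints matching paths and returns 1 if any are found, 0 if none are found,
# and 2 if called without arguments.

find_qwen_refs() {
  if [ "$#" -eq 0 ]; then
    echo "usage: find_qwen_refs PATH..." >&2
    return 2
  fi
  local hits
  # grep exits non-zero when nothing matches; "|| true" keeps that from
  # aborting callers that run with "set -e".
  hits=$(grep -rli "qwen" -- "$@" 2>/dev/null || true)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}

# Example invocation from the repository root, using paths named above:
# find_qwen_refs \
#   examples/diffusion/generate/generate.py \
#   tests/ci_tests/configs/diffusion_finetune/nightly_recipes.yml \
#   tests/ci_tests/scripts/diffusion_finetune_launcher.sh
```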